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Abstract 

A direct integration algorithm is described to compute the magnetostatic field and en- 
ergy for given magnetization distributions on not necessarily uniform tensor grids. We use 
an analytically-based tensor approximation approach for function-related tensors, which re- 
duces calculations to multilinear algebra operations. The algorithm scales with N^/^ for N 
computational cells used and with N"^/^ (sublinear) when magnetization is given in canonical 
tensor format. In the final section we confirm our theoretical results concerning computing 
times and accuracy by means of numerical examples. 
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1 Introduction 

Computation of the magnetostatic field is the most time-consuming aspect in micromagnetic 
simulations. This is usually done by evaluating the magnetostatic scalar potential, which in- 
volves the solution of a Poisson equation, e.g., by means of a hybrid FEM/BEM method, see 
e.g. [E]. Alternatively, direct computation by discretization of a volume integral formulation 
of the scalar potential normally scales with the total number of computational cells squared, 
i.e. 0(n®) for cells. Several techniques have been introduced in the literature to reduce com- 
putational costs, e.g. the fast multipole method (combined with FFT), [6], [IT], NG methods, 
|16j . scaling linearly, i.e. O(n^) and H-matrix techniques, [T9j, with almost linear complexity, 
i.e. O(n^logn). Recent developments show sublinear compression properties, i.e. O(n^), both 
for storage requirements and computational complexity by applying multilinear algebra approx- 
imation techniques to the demagnetizing and magnetization tensors [10]. For this purpose the 
magnetization tensor has to be represented or approximated in canonical tensor format [14j, 
and a non-regular grid cannot be used. Some further difficulties appear when a purely algebraic 
approach is used for tensor approximation, e.g., the best low-rank approximation problem is 
not well-posed and globally convergent algorithms do not exist so far [8], [14j . 
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In Sec. [51 we present an analytically-based tensor approximation method, which can be 
generally used for a special class of function-related tensors [11]. Calculation of the scalar po- 
tential and the stray field reduces to multilinear algebra operations, which can be implemented 
efficiently using optimized libraries [18], |2], [3]. Magnetization can also be treated as low-rank 
tensors and the data-sparse format is preserved by the method, which scales almost linearly, 
i.e. 0{n'^) in the general case and sublinear, i.e., O(n^) for specially structured magnetization 
tensors, e.g. in CP format (see[X]) which was also used in [1^. Using canonical tensor formats 
could open up new possibilities for solving the variational model by Landau-Lifschitz for sta- 
tionary micromagnetic phenomena efficiently by projection methods (e.g., CG or GMRES) for 
linear systems in tensor format, recently introduced in [4J. 

Because of the sublinear scaling of tensor-grid methods, micromagnetic methods that compute 
magnetization dynamics or hysterisis properties (whereby the magnetization is in the CP for- 
mat) have a high potential for solving large scale engineering problems. The aim of this work 
is to provide a key-building block of such an algorithm: The computation of the magnetostatic 
energy from magnetization distriubutions given in CP format. 

In Sec. E] we test our algorithm by computing the magnetostatic potential, field and energy 
of hexahedral ferromagnetic bodies for different given magnetizations. 



2 Method 

2.1 Analytical preparations 

The magnetostatic scalar potential in a ferromagnetic body C M'^ induced by a given magne- 
tization distribution M is usually given by the formula [12] 

and the stray field then reads 

Hd = -V0. (2) 

We aim for computing the scalar potential by means of multilinear tensor operations and there- 
fore prefer discretizing a volume integral instead of the formulation from Eq. ([1]). Integration 
by parts leads to 

(x) = ^ [ M(x') • d^x'. (3) 

47r./o Ix-f'P 



We denote the three volume integrals in Eq. ([3]) by 

i^\x)= / A'&Xx') ^ 3 dv, (4) 

Jq \x — x'\ 

for each of the components M^P\ p = 1 ... 3, of the magnetization M, so Eq. ([3|) reads 

4 vr ^ — ' 

p=i 

As a first step, in order to get rid of the singularities at x' = x, we represent the integral kernel 
in dH as an integral of a Gaussian function by the formula 

1 2 



P 



T^e-^PdT. (6) 
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So one has, for p =\x — x'p, 

&\x) = ^ [ [ e-^'\'^-^'\^ M^P\x'){x^^^ - x'^''^)d^x' dr. (7) 

Eq.© reduces the computation to independent integrals along each principal direction. As we 
will see later, this results in a reduction of computational effort from 0{N'^) to 0{N^/^). 

The rest of the paper is organized as follows. First we discuss the r - integration in Eq. ([7]) 
via sine- quadrature. Then we consider the spatial discretization of the resulting quadrature ap- 
proximation on tensor grids and discuss its computational realization. In Sec. [3] we present some 
numerical results on computational complexity and accuracy for different given magnetizations. 

2.2 r - Integration 

Although the singularity is gone, the numerical treatment of ([7]) is not straightforward. Small 
values of p have a dispersive effect on the integrand in ([6]), so one has to distribute the quadrature 
nodes over a wide range for accurate approximation of the kernel function 1 / p'^/^ . These values 
of p correspond to a small grid size, which is again essential for an accurate approximation of 
the magnetic scalar potential. Therefore one has to use a quadrature rule that is robust with 
respect to small values of p. 

For Gaussian quadrature, weights and nodes can be computed by solving an eigenvalue 
problem of symmetric tridiagonal type. For a larger number of quadrature points one uses the 
QR algorithm, which scales linearly in the number of quadrature terms. On the other hand, 
using Gaussian quadrature formulas on a finite subinterval [0, j4], or Gauss-Laguerre quadrature 
over the infinite interval in 1^ fail in terms of achieving a sufficiently good representation of 
the kernel function for small values of p. 

In |13] an integral representation for the Newton potential, i.e., 1/p, was used to compute the 
electrostatic potential for given Gaussian distributed charges in whole space. The r - integration 
was performed using the Gauss-Legendre formula on logarithmically scaled blocks of the interval 
[0, 10^] using a total of 120 quadrature points. Due to geometrical refinement against zero, the 
functional 1/p is well described in this region. 

Here we use the exponentially convergent sine- quadrature for numerical integration of the 
integral }11] . This method shows better approximation properties than the Gauss-Legendre 
formula for much fewer quadrature terms. 

Abbreviating notation and exploiting symmetry in r, the /^P-* take the form 

/(P)(x) = 4=/ T'^2F^P\x,T)dT, (8) 
V^r Jo 

where F^p^ stands for the Jl- integral in ([7]). In order to guarantee the above mentioned ex- 
ponential convergence rate we perform an integral transform on Eq. ([8]), i.e. r = sinh(t), and 
apply numerical integration afterwards, which gives 

poo ^ 

/ T^2F^P\x,T)dT^y^u;ismHtifG^P\x,ti), (9) 
•^0 1=1 

where {ti,uji) are the nodes and weights of the underlying quadrature, and G^P\x,t) = 2 F^p\x , sinh{t)) . 
To apply sine- quadrature we use the R + 1 nodes and weights given by 

ti = IhR (10) 
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Table 1: Average abs./rel. errors of sinc-quadrature on p-intervall [5e — 05, le — 02]. Left: 
Co-dependence of approximation of ^ {R = 50). Right: i?-dependence of approximation of ([6]) 
(co = 1.85). 
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and 



(11) 

1 2 /iR cosh(t,) / > 0, 



with hji = Co ln{R)/R for some appropriate co, see Proposition 2.1 in [TT] . 

Tab. [T] shows the average absolute and relative errors due to sinc-approximation of the 
functional ([6]) for 10^ equidistantly chosen p-values of the interval [5e — 05, le — 02] (which 
corresponds to a mesh size of 10~^ up to 200~^ in a uniform tensor grid, see Sec. 12. 3p . The left 
three columns show accuracy for different values of the parameter cq and R = 50, right columns 
show dependence of the number of quadrature terms and cq = 1.85. 

For our numerical experiments in Sec. [3l we use co = 1.85, which gives a sufficiently good 
description of the functional ([6]) . 

One could think about optimizing cq by minimizing the number of quadrature terms for a given 
accuracy of the description of the functional ([6]), but we do not address possible algorithmic 
realizations of this topic in this paper. 



2.3 Discretization on tensor grids 

For the sake of simplicity let us first assume a uniformly discretized cube fi, where the tensor 
product grid consists of subcubes ilj, j := (j'l, J2) js)- We make the assumption of constant 
magnetization for each spatial component p = 1 ... 3 in each subcube, i.e. 

M^P^=^m'f\n,, (12) 
j 

(p) 

where xn- = Xn, Xn. Xn, is the 3-d characteristic function of the subcube Qi, and m. the 
components of the 3-d p-component magnetization tensor M^^\ 

The computational realization of the quadrature approximation Q to Eq. d?]) requires eval- 
uation of the scalar potential at the center point x? = (xf^ , x?^ , ) of each field cell. To this 
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end we substitute Eq. 1^ into the function G^p^ of Eq. p. This leads to 

3 „ 

G^^HxlU) = n / 9^'Hxl,x',ti)xn,^{x')dx', (13) 

i q=l 



where 



I [a — a ) exp[— smh[T) [a — a j ) q = p- 
The three integrals in Eq. ()13p define (n x n) - matrices, i.e. 

di„j,-= 9{xl,x',ti)dx', (15) 
So we have a Tucker representation of the function G^^^ (see[X|), i.e., 

G(^) (xf , to = E .3 4 .1 4 4 .3 (17) 
j 

= M(f) Xi^i' XsD^ XgDj, (18) 

with the core tensor M^^^. 

2.4 Magnetization in CP format 

The main goal of this paper is the development of an algorithm for magnetostatics that allows 
the treatment of magnetization in tensor low-rank formats and also preserves this format. In 
the following, we can see how the preparations of the previous sections make it possible to 
compute the potential for CP-magnetization. 

Combining Eqs. ([3]), ([9]) and (|17p yields the scalar potential at the center points 

^ 3 R 

^nxnxn ^ ^ ^ sinh(t, )2 M^^) X i X2 X3 D^. (19) 

p=l 1=1 

Assuming M^^^ G Cn,rp, see El shows p9p to be in canonical format as well, i.e. 

3 R 



p=i 1=1 " 



where Mg^^ G M"^''p, A^^') G Wp, q,p = 1 ... 3 and the factors coi sinh(t;)^ are absorbed by the 
weight vectors A*-^^ 



2.5 Non-uniform grids 

Let now ^7 be a non-uniformly partitioned cube, i.e. 

3 

i^=ui^j, i^j=n%' (21) 

jeJ i=i 
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where J C N"]^ is a set of multiindices. 

For the following we define Up := maxjg j jp, and the extension J of J by 

J ■■={j\jp = l...np, p = 1...3}. (22) 

Similarly, the magnetization (|12|) is then given by 

M(P^ = Y,mfxn,, (23) 
jeJ 

where we set mj^^ = for j G J \ J. 

The p-component magnetization tensor is thus formally an element of M"!^"^^"^, and the 
remaining approach is similar to that in Sec. 12.31 The matrices in (|15p are now of different 
dimensions for each component q, i.e., G ]R"9X"<7. The magnetization tensors might now 
be sparse, so for the mode-multiplication in ()17p this can be taken into account to reduce 
computational costs [3]. 

2.6 Computational issues 

For evaluating ()17p we use Gauss-Legendre quadrature with not more than 50 terms for the 
Gaussian integrals in (jlSp . 

In the general case one q-mode matrix multiplication for the p-component tensor M*-^-* in ()17p 
requires n matrix-matrix products and thus can be performed in O(n^) operations [3]. Since 
for mode multiplication the n matrix products along the corresponding mode are independent, 
it is easy to perform it in parallel, which reduces computation time, when using current state 
of the art computer architecture. 

When further special structure for the p-component magnetization tensors or the matrices 
in ()17p is given, costs reduce significantly. If M*-^^ is assumed to be in canonical format, i.e. 
M*-^^ G Cn,r, (as in [lOj), costs for the q-mode matrix multiplication reduce to 0{rn^) (see 
Eq. (03]) or [3]), where r stands for the Kronecker rank. 

Thus, Eq. (|17p scales O(n^) in the general case respectively O(n^) for canonical magnetization 
and is computed for each /, which does not depend on the grid spacing for robust r- quadrature 
described above. Therefore the number of r- quadrature terms have no influence on compu- 
tational costs with respect to n. Summarizing, the total operation count for ([19]) and (pO]l 
becomes approximately 

3R{g/n'^ + 3)n'^ for ([l9]), and (24) 

3 

3R{g + J2 for & > (25) 

p=i 

where g and R denote the number of Gaussian- and sinc-quadrature terms, respectively. 

Although we do not intend to create an alternative to existing algorithms for the computa- 
tion of pairwise charge or mass interactions with linear complexity in N = n"^ (e.g. FMM [5j or 
NG methods [16]) rather than presenting a way for the computation of magnetostatic poten- 
tials/fields for data-sparse CP-magnetization, we briefly want to compare in terms of operation 
counts. 

Asymptotic operation counts for different versions of FMM can be found in the literature. For 
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instance, in [S] the fastest version in three dimensions (exponential translation) counts approx- 
imately 200Np + "i.bNp^ operations, where N is the total number of cells and p = log^l/e 
(number of multipoles), for given accuracy e of the multipole expansion (the average amount 
of particles in one box at the finest level is here s = 2p for sake of convenience). Chosing 
p = 30, for ten digits of accuracy, gives approximately 10^ operation for the calculation of 
the magnetic scalar potential due to pairwise interactions. As can be easily recalculated, this 
clearly outperforms (j24p (e.g. for grids larger than N = 20'^, with R = g = 50), but is itself 
outperformed by ()25p for any grid size, when assuming rp < n. 

At this point it is also worth mentioning that in the case of CP-magnetization with Vp <n the 
storage for the magnetization tensors is compressed by a factor of X]p=i TpjT? <?>ln. We also 
want to refer to Sec 131 where the Eqs. ()24p and ()25p are confirmed by numerical examples, as 
well as, the accuracy of the introduced approach is discussed. 

The resulting /^^^ are 3-d tensors defined on the tensor product grid. Once the potential has 
been computed, one has to perform discrete differentiation to obtain the field (12]). This can be 
done by q-mode sparse matrix multiplication, which scales O(n^) for ()19p . as does q-mode vector 
multiplication, and 0{n'^) for the canonical version (|20p . see 13]. Here we use a sparse finite- 
difference matrix corresponding to a three-point finite-difference approximation of order 2 for 
the first derivative. Assuming a (not necessarily uniform) mesh in one spatial direction p, e.g., 
with mesh sizes hj,j = 1 . . . n, one has to use general finite-difference approximations. Since the 
potential is only given at the center points, we first denote by hj := {hj + hj+i) /2, j = 1 . . . n — 1, 
the distance between successive midpoints. For interior points the corresponding second order 
centered finite-difference approximations are given by 
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where hk,hi are the distances to the left and right neighbor of a midpoint x, respectively. For 
the boundaries we use the analogous one-sided scheme 
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The resulting finite-difference matrix with respect to the p - th spatial direction is then given by 
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(28) 



The tensor $ representing the scalar potential on the center points of the field cells, is given 



by the entries 



and the stray field can now be computed by evaluating the 
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3-component tensor 



( ^XlJn 

<I>X2 J2 I . (29) 



Furthermore, the demagnetizing energy is given by [15 



i?demag = -7 / Hrf • M d^X. (30) 

On a uniform mesh, a rough estimate for -Edcmag is obtained by midpoint quadrature, 

3 

P=l j 

where V^eii = ^/n^, and * denotes the Hadamard tensor product [T^. In the case of a non- 
uniform mesh, the midpoint approximation reads 

3 

ii^demag « f E E ^ * ^^"^ * >^p J^, (32) 
P=l j 

where V denotes the volume tensor containing the volumes of the computational cells as entries. 

In the case where the p-component magnetization tensors are given in canonical tensor 
format, the functions G^^^ from Eq. ()17p are in the same format and therefore, from the additive 
structure of Eqs. Q and (0), one recognizes $ to be given in canonical format (see Eq. (|2U|) 
and[Aj). In this case efficient mode multiplication can be applied in ()29p . see [3] for details. The 
components of the stray field are then in canonical format, which enables evaluation of (j3ip by 
using inner products , i.e. 



E M(p) * hJ^ = (vec(M(P)), vec(Hj'M , (33) 



where vec(-) denotes vectorization. The inner product (•, •) can be performed in canonical format 
at a cost of merely 0{RmRh n) operations; see lAl and [3] for details. (Here, Rm,Rh denote 
the Kronecker ranks of M^^) and H^''^) Additionally Eq. ^ can be carried out in CP-format 
since the volume tensor for tensor product grids can be considered as a rank-1 tensor. 

3 Numerical results 

We use MATLAB v7.11 for our computations, including the Tensor Toolbox [2]. All timings 
are reported for a Linux Workstation with a Quad-Core Intel 17 processor and 6 GB RAM. 

3.1 Accuracy of the magnetic scalar potential 

Our approach for computing the scalar potential yields formulae (|19p and ()20p . respectively. 
As indicated in the previous section this has almost linear effort, i.e. 0{N^/^), in the general 
case and sublinear effort, i.e. 0{N'^^^), for magnetization in CP format. It is notable that our 
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Table 2: Comparison of accuracy of Eqs. and (fTUj) {R = 50). Absolute (-Errxo/so) and 

relative (i?elerriQ/5o) L2-errors for a N = 10^ (exact errors) and = 50^ (errors for 50 randomly 
chosen mesh-points) tensor product grid, g'^^'^ and g^'^^ indicate the order of Gaussian quadrature 
used for the integrals in ()34p and ()15p . 
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Table 3: Comparison of accuracy of Eqs. (fM|) and (fT9|) (R = 50). Absolute (ii^rr 10/50) and rela- 
tive (i?e/errxo/5o) -^^2-errors for a A^ = 10^ (exact errors) and A^ = 50^ (errors for 200 randomly 
chosen mesh-points) tensor product grid, g'^" indicates the order of Gaussian quadrature used 
for the integrals in ()15p . the integrals in ()34p are evaluated exactly. 
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approach leads to the same accuracy as direct integration of Eq. ([3]) with complexity 0{N'^), 
i.e. 
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(34) 



This is due to the robust sine- quadrature with respect to p (see Tab. [T]) that we used for 
representing the kernel function by Eq. ([6]). 



For demonstration we assumed random magnetization uniformly distributed in the interval 
(— 1, 1) on a tensor product grid. We use the absolute and relative L2-errors as measurements 
for the accuracy, i.e. 
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where ^>*^''^ and '^^'^^ are the tensors representing the potential at the center points of the com- 
putational cells obtained by the direct formula and the tensor approach (fT9]) , respectively. 
Tab.[2]shows that both have the same accuracy level for equal orders of Gaussian quadrature for 
tensor product grids with A^ = 10'^ and A^ = 50^. Note that the integrands in (j34p get singular 
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for X = x' , where x denote the centers of field cells. The contribution of the singular integrals is, 
however, zero, since the integrand is an odd function with respect to the p'-coordinate and so it 
is possible to treat the singularities with Gaussian quadrature (which is symmetric). In Tab. [3] 
we give the relative and absolute errors for a N = 10^ and N = 50^ grid, where we now evaluate 
the integrals in (j34p exactly by using the formulas from [15] , Numerical Micromagnetics: Finite 
Difference Methods. 

Essentially there is not any difference in accuracy between the direct integration according to 
([3l|) and the approach in this paper, at least for grid-sizes where the used sine- quadrature is 
guaranteed to be robust with respect to p, see also Sec. 12. 2i 

The same accuracy statement holds for the 0(A^^/^)-algorithm since no further approximation 
is necessary to derive ([20]) out of (fT9]) . unless magnetization is given in Cn,r- 

3.2 Sublinearity and data-compression in the CP-case 

In order to demonstrate the reduction of computational complexity to sublinearity in the volume 
size we first assume a constant magnetization distribution in the z - direction of the unit-cube, 
i.e., M = (0, 0, M^), where = Mg ruz- We set the saturation magnetization to 1 and compare 
the results with the exact solution, then given by an energy density of -Edemag/ /"o = rn1/6. Fig.[T] 
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Figure 1: Results for M = (0, 0, M^) on the unit cube. Absolute error in the energy for 
n = 20 . . . 300 (number of computational cells is A'^ = re^) versus n, the number of discretization 
points in one direction. In addition, cpu times versus n are plotted for the computation of 
the scalar potential. The curve marked by V shows the case with magnetization tensor given 
in dense tensor format, the curve marked by A corresponds to the case with magnetization in 
canonical format. 
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Table 4: Computational complexity for calculation of the scalar potential in CP format (ran- 
domly assembeled magnetization, i.e. M^^^ G C„,,5). Cpu-times (in sec) averaged for 20 identical 
experiments each and given for increasing n and R. In all computations the number of Gaussian- 
quadrature nodes is 30. 
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shows the absolute errors in the energy and the cpu times for computation of the scalar potential. 
The error decreases with order about 1.6, the cpu times increase with order ~ 3.6 up to 4 in 
the case where a dense tensor format is used for the magnetization tensor, and with an order 
between 2.4-2.6 (sublinear), when the magnetization is represented in canonical tensor format 
(rp = 1, p = 1 . . . 3). We have used 50 Gauss-Legendre quadrature terms and increased the 
number of sine terms R according to 35 -|- 3n/20, so using 38 up to 80 terms. 

For the purpose of verification of the asymptotic operation count of Sec. 12.61 we compute 
the magnetic scalar potential for randomly assembled p-component magnetization tensors of 
rank 5, i.e. M^*'-* G C„.5, and measure the cpu-times (averaged for 20 experiments each) with 
respect to increasing R and mesh-parameter n. In Tab. S] we can observe the linear increase in 

Ar2/3 = ^2 

We next consider a flower-like magnetization state that allows a sufficiently good low rank 
approximation in the CP-format without severe loss of accuracy. The main magnetization direc- 
tion is taken to be along the z - axis, and the flower is obtained through an in-plane perturbation 
along the y - axis and an out-of-plane perturbation along the x - axis. Assuming polynomial ex- 
pressions for the perturbations, as in [7], our flower is the normalized version of 

M^(r) = l{x-x^){z- Zm), 

MHr) = -,{y - ym){z -z^) + ^{y- y„,f{z - z^f , (37) 
M^(r) = 1, 

where Xm,ym and Zm are the coordinates of the center of the cube. We choose a = c = 0.5 
and 6=1 and generate a dense magnetization state on a = 100'^ tensor product grid with 
self-energy 1.418772 e — 01 [/Iq^M"^], computed with the optimized parameter cq for the sinc- 
quadrature (see Sec. I2.2p and 50 nodes for the Gaussian quadrature. 

We now approximate the above dense magnetization in CP-format by using an alternating least 
squares algorithm that scales almost linear in N , see e.g. [14J. Choosing Kronecker ranks r = 5 
for each magnetization component, i.e. M(p) e Cioo,5,P = 1...3, results in a relative Lo- 
approximation error of less than 1 e — 06. The data storage requirements for the magnetization 
tensors have been compressed by a factor of 1.5 e — 03. For the field computation we use the 
less accurate sine- quadrature with R = 35 and only 10 Gaussian quadrature nodes. Compared 
to the dense and more accurate computation, the CP-approximation only results in a relative 
error in the energy of 1.8e — 04 and 2.6 e — 05 in the relative L2-error norm for the magnetic 
scalar potential. The storage requirements for the stray field are about 13.5 % of that for the 
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Table 5: Absolute error in the energy between results for direct tensor integration algorithm and 
FEM/BEM for uniform magnetization distribution in the unit cube. N indicates the number 
of total nodes in the mesh. In the fourth column we give the rel. deviation of the energy 
values computed by the two numerical schemes; percentage is based on the true value, i.e. 



N 


error (tensor) 


error (FEM/BEM) 


deviation [%] 


15^ 


1.38 e-04 


1.55 e-03 


8.50e-l 


30^ 


8.19 e-05 


4.51 e-04 


3.20 e-1 


60^ 


3.98 e-05 


3.47 e-04 


2.32 e-1 



dense case. 

Computation in the CP-format based on algebraic compression, like in the above example, 
might not work for arbitrary magnetization and also needs a setup-phase that scales with 
(9(7V^/3), unless tensor AC A is used like in [10 J. CP-based schemes are therefore particulary 
efficient if the input-magnetization has CP-structure and can be preserved by the algorithm. 
Subsequent work will concentrate on fast and data-sparse static micromagnetic simulations in 
the space Cn,r (minimization in the CP-format), using the tensor approach in this paper for 
magnetostatic energy computation. 

3.3 Comparison with FEM/BEM 

Like in the previous section we first compare the proposed scheme of this paper with that ob- 
tained by the finite element simulation package FEMME [1] in the case of uniform magnetization 
where no dicretization error for the magnetization arises. The used FEM/BEM implemention 
solves the weak formulation of the magnetostatic Poisson equation. Dense boundary element 
matrices are approximated in the H-matrix format and preconditioned iterative linear solvers 
are used to gain an almost linear complexity in the volume size, see e.g. [IF], Numerical Methods 
in Micromagnetics (Finite Element Method). 

From Tab. Oone can see that the FEM/BEM algorithm approximates in the case of uniform 
magnetization configuration about one order of magnitude worse than the direct tensor inte- 
gration algorithm. 

We now take a vortex-like state in a 200 nm^- cube, described by the model in [9], i.e. 

M^(r)= -f (l-exp(-4j;J))^ 

M^(r)= ^(l-exp(-4j;^))i (38) 
M-(r)= exp(-2^), 



where r = y'x^ + y^, and we choose the radius of the vortex core as rc = 28 nm. The vortex 
center coincides with the center of the cube, and the magnetization is assumed to be rotationally 
symmetric about the x = y = 100 nm axis and translationally invariant along the z - axis. For 

the above configuration ()38p and an amount of 50^ nodes and about 5 • 50'^ tetrahedral elements, 
FEMME finds Edemag/fJ-o = 2.16185 e — 02, which we take as reference value. We compare this 
value with computations using the direct tensor integration algorithm introduced in Sec. [2] on 
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Table 6: Absolute and relative deviation in the energy between results for direct tensor inte- 
gration algorithm and finite element reference- value for magnetization distribution of a vortex 
in the unit cube given by Eqs. (|38p . In order to resolve the vortex, we use a non-uniform grid 
(geometrically refined towards the center of the cube). The columns to the right show the 
minimal grid size, i.e. min/ij, in the center of the cube and respectively the maximal value, i.e. 
max hj, next to the boundaries. 



n 


abs. deviation 


rel. deviation [%] 


grid-min 


grid-max 


10 


2.02 e-04 


9.32 e-01 


8.9 e-02 


1.1 e-01 


20 


1.61 e-04 


7.42 e-01 


3.3 e-02 


7.2 e-02 


30 


5.58 e-05 


2.58 e-01 


1.3 e-02 


6.6 e-02 


40 


6.65 e— 06 


3.07 e-02 


5.5 e-03 


6.6 e-02 


50 


1.78 e-06 


8.23 e-03 


2.8 e-03 


6.4 e-02 



Table 7: Absolute and relative deviation between results for direct tensor integration algorithm 
and finite element reference-value for magnetization distribution in the unit cube given by 

Eqs. daiD. 



n 


abs. deviation 


rel. deviation [%] 


20 


3.42 e-04 


2.24 e-01 


30 


2.83 e-04 


1.85 e-01 


40 


2.43 e-04 


1.59 e-01 


50 


2.18 e-04 


1.43 e-01 


80 


1.83 e-04 


1.20 e-01 



an adaptive mesh refined geometrically, in each spatial direction, towards the vortex center of 
the cube, see Tab. [H 

Finally we compare our algorithm with FEMME for a flower-like state of the cube in the 
previous example, where we choose a = c = 1 and 6 = 2 in Eq. (|37|) . 

Using the same finite element mesh as in the previous example, FEMME now finds -Edemag/^o = 
1.52653 e— 01, which we again take in order to compare both approximations. Tab.[7]shows abso- 
lute and relative deviations, in this case on a uniform grid used for the direct tensor integration 
algorithm. One can observe a similar difference like in Tab. [5j 

4 Conclusions 

We have shown, both theoretically and via numerical experiments, that the algorithm introduced 
in this paper allows computation of the magnetostatic field and energy in reduced complexity 
(below linear effort in the number of computational cells used) when magnetization tensors are 
given in canonical format. We expect that, in the future, the tensor approximation approach 
can be used for computing equilibrium states for a well-defined initial magnetization given in 
canonical format. 
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A Background on tensor formats 

Here we briefly review some basic facts about tensor formats, see e.g. [13], [3] for more details. 
Specifically, we consider the case of a 3-d tensor A E ]^"ix"2x?i3_ 

For a matrix U G M'^^'^j the j-mode matrix product A XjU of the tensor A with U is 
defined element-wise in the following way. E.g. for j = 1, 

ni 

(A X I U)i-^ 42 43 ^ ^ Oj' 22 43 Uii i' ) (39) 

i'=l 

i.e., the resulting tensor A Xi [/ G ]K™x"2xn3 jg obtained by right-multiplication of the 1-mode 
fibers (columns) of A by U. Analogously for j = 2,3; the cost for the computation oi AxjU 
is 0(m Jl^^^ Uj) operations in general. 

The tensor A is said to be in Tucker format (Tucker tensor) if it is represented in the form 

A = CXi[/iX2C/2X3C/3, (40) 

with the so-called core tensor C G ]^mixm2xr?i3 ^^^^ matrices Uj G IR"j^™j . 

The tensor A is said to be in canonical format (CANDECOMP/PARAFAC (CP) decompo- 
sition) with (Kronecker) rank R, if 

A = f]A, n«onf 0^3) (41) 

r=l 

with Xr G M, unit vectors ui''^ G M"^ , and o is the vector outer product. Abbreviating notation 
as in ^4J, a tensor in CP format is written as 

A = lA; f/(i),?7(2),C/(3)j, (42) 

with weight vector A = [Ai, . . . , A/j] G and matrices U^^^ = [u^^^ | . . . | ] G W^''^. The 
storage requirement for the canonical tensor format amounts to 0{R Yl'j=i^j)- 
In the following we write Cn,r for the set of canonical tensors with mode length n = (ni, n2, 713) 
and rank r, and simple Cn,r, when the mode-lengths are equal. 

If the core tensor in Eq. (|40p is given in canonical tensor format, i.e. C = JA; V^-^^ V^'^\ V^^^} 
with y^-') G M^J^^, one can easily transform Eq. (j40p into CP format for a cost of 0{R Yl'j=i ^j) 
operations, giving 

A = [A; ^«y«,C/(2)y(2),^(3)y(3)j, (43) 

The inner product for two canonical tensors A G Cn,ri and B G Cn,r2 ; as well as many 
other operations, can be performed with reduced complexity (for the inner product operation 
it acounts to 0{rir2 Uj) ), see e.g. [3]. The reduced complexity (and also the data-sparsity) 
makes it worth developing algorithms in CP format. 
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